Barcode identification for single cell genomics
Internal identifier: 000683 (Main/Exploration); previous: 000682; next: 000684
Authors: Akshay Tambe [United States]; Lior Pachter [United States]
Source:
- BMC Bioinformatics [1471-2105]; 2019.
French descriptors
- KwdFr :
- MESH :
- génétique : ADN.
- Analyse de séquence d'ADN, Génomique, Humains, Séquençage nucléotidique à haut débit.
English descriptors
- KwdEn :
- MESH :
- chemical , genetics : DNA.
- methods : Genomics, High-Throughput Nucleotide Sequencing, Sequence Analysis, DNA.
- Humans.
Abstract
Single-cell sequencing experiments use short DNA barcode ‘tags’ to identify reads that originate from the same cell. In order to recover single-cell information from such experiments, reads must be grouped based on their barcode tag, a crucial processing step that precedes other computations. However, this step can be difficult due to high rates of mismatch and deletion errors that can afflict barcodes.
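Here we present an approach to identify and error-correct barcodes by traversing the de Bruijn graph of circularized barcode k-mers. Our approach is based on the observation that circularizing a barcode sequence can yield error-free k-mers even when the size of k is large relative to the length of the barcode sequence, a regime which is typical of single-cell barcoding applications. This allows for assignment of reads to consensus fingerprints constructed from k-mers.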
We show that for single-cell RNA-Seq, circularization improves the recovery of accurate single-cell transcriptome estimates, especially when there is a high number of errors per read. This approach is robust to the type of error (mismatch, insertion, deletion), as well as to the relative abundances of the cells. Sircel, a software package that implements this approach, is described and publicly available.
The online version of this article (10.1186/s12859-019-2612-0) contains supplementary material, which is available to authorized users.
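The observation about circularized k-mers in the abstract can be illustrated with a minimal Python sketch. This is not the Sircel implementation: the function names, the 12-base barcode, the single mismatch, and the choice of k = 8 are all hypothetical values picked for illustration. It only shows why wrapping a barcode into a circle preserves some error-free k-mers when k is large relative to the barcode length.

```python
def linear_kmers(seq: str, k: int) -> list:
    """All k-mers of seq read as an ordinary (linear) string."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def circular_kmers(seq: str, k: int) -> list:
    """All k-mers of seq read as a circle: one k-mer starts at every position."""
    doubled = seq + seq[:k - 1]  # wrap the first k-1 bases around to the end
    return [doubled[i:i + k] for i in range(len(seq))]


# Hypothetical example values (not taken from the paper).
true_barcode = "ACGTTGCAAGCT"  # 12-base barcode
observed = "ACGTTGGAAGCT"      # same barcode with one mismatch at index 6
k = 8                          # k is large relative to the barcode length

# k-mers that the erroneous read still shares with the true barcode.
shared_linear = set(linear_kmers(true_barcode, k)) & set(linear_kmers(observed, k))
shared_circular = set(circular_kmers(true_barcode, k)) & set(circular_kmers(observed, k))

print(len(shared_linear), len(shared_circular))  # prints "0 4"
```

In this example every linear 8-mer overlaps the mismatched base, so none survives. The circular reading adds the k-mers that wrap around the end of the barcode, and the four that avoid index 6 remain error-free. A de Bruijn graph built from such circular k-mers would therefore still contain evidence of the true barcode.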
Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6337828
DOI: 10.1186/s12859-019-2612-0
PubMed: 30654736
PubMed Central: 6337828
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: 000278
- to stream Pmc, to step Curation: 000278
- to stream Pmc, to step Checkpoint: 000390
- to stream PubMed, to step Corpus: 000667
- to stream PubMed, to step Curation: 000667
- to stream PubMed, to step Checkpoint: 000657
- to stream Ncbi, to step Merge: 002084
- to stream Ncbi, to step Curation: 002084
- to stream Ncbi, to step Checkpoint: 002084
- to stream Main, to step Merge: 000686
- to stream Main, to step Curation: 000683
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Barcode identification for single cell genomics</title>
<author><name sortKey="Tambe, Akshay" sort="Tambe, Akshay" uniqKey="Tambe A" first="Akshay" last="Tambe">Akshay Tambe</name>
<affiliation wicri:level="2"><nlm:aff id="Aff1"><institution-wrap><institution-id institution-id-type="ISNI">0000000107068890</institution-id>
<institution-id institution-id-type="GRID">grid.20861.3d</institution-id>
<institution>Division of Biology and Biological Engineering,</institution>
<institution>California Institute of Technology,</institution>
</institution-wrap>
116 Kerckhoff Laboratory, Pasadena, CA 91125 USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName><region type="state">Californie</region>
</placeName>
<wicri:cityArea>116 Kerckhoff Laboratory, Pasadena</wicri:cityArea>
</affiliation>
</author>
<author><name sortKey="Pachter, Lior" sort="Pachter, Lior" uniqKey="Pachter L" first="Lior" last="Pachter">Lior Pachter</name>
<affiliation wicri:level="2"><nlm:aff id="Aff2"><institution-wrap><institution-id institution-id-type="ISNI">0000000107068890</institution-id>
<institution-id institution-id-type="GRID">grid.20861.3d</institution-id>
<institution>Departments of Biology and Computing &amp; Mathematical Sciences,</institution>
<institution>California Institute of Technology,</institution>
</institution-wrap>
116 Kerckhoff Laboratory, Pasadena, CA 91125 USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName><region type="state">Californie</region>
</placeName>
<wicri:cityArea>116 Kerckhoff Laboratory, Pasadena</wicri:cityArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">30654736</idno>
<idno type="pmc">6337828</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6337828</idno>
<idno type="RBID">PMC:6337828</idno>
<idno type="doi">10.1186/s12859-019-2612-0</idno>
<date when="2019">2019</date>
<idno type="wicri:Area/Pmc/Corpus">000278</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000278</idno>
<idno type="wicri:Area/Pmc/Curation">000278</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000278</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000390</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000390</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:30654736</idno>
<idno type="wicri:Area/PubMed/Corpus">000667</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000667</idno>
<idno type="wicri:Area/PubMed/Curation">000667</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000667</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000657</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">000657</idno>
<idno type="wicri:Area/Ncbi/Merge">002084</idno>
<idno type="wicri:Area/Ncbi/Curation">002084</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">002084</idno>
<idno type="wicri:Area/Main/Merge">000686</idno>
<idno type="wicri:Area/Main/Curation">000683</idno>
<idno type="wicri:Area/Main/Exploration">000683</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Barcode identification for single cell genomics</title>
<author><name sortKey="Tambe, Akshay" sort="Tambe, Akshay" uniqKey="Tambe A" first="Akshay" last="Tambe">Akshay Tambe</name>
<affiliation wicri:level="2"><nlm:aff id="Aff1"><institution-wrap><institution-id institution-id-type="ISNI">0000000107068890</institution-id>
<institution-id institution-id-type="GRID">grid.20861.3d</institution-id>
<institution>Division of Biology and Biological Engineering,</institution>
<institution>California Institute of Technology,</institution>
</institution-wrap>
116 Kerckhoff Laboratory, Pasadena, CA 91125 USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName><region type="state">Californie</region>
</placeName>
<wicri:cityArea>116 Kerckhoff Laboratory, Pasadena</wicri:cityArea>
</affiliation>
</author>
<author><name sortKey="Pachter, Lior" sort="Pachter, Lior" uniqKey="Pachter L" first="Lior" last="Pachter">Lior Pachter</name>
<affiliation wicri:level="2"><nlm:aff id="Aff2"><institution-wrap><institution-id institution-id-type="ISNI">0000000107068890</institution-id>
<institution-id institution-id-type="GRID">grid.20861.3d</institution-id>
<institution>Departments of Biology and Computing &amp; Mathematical Sciences,</institution>
<institution>California Institute of Technology,</institution>
</institution-wrap>
116 Kerckhoff Laboratory, Pasadena, CA 91125 USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName><region type="state">Californie</region>
</placeName>
<wicri:cityArea>116 Kerckhoff Laboratory, Pasadena</wicri:cityArea>
</affiliation>
</author>
</analytic>
<series><title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint><date when="2019">2019</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>DNA (genetics)</term>
<term>Genomics (methods)</term>
<term>High-Throughput Nucleotide Sequencing (methods)</term>
<term>Humans</term>
<term>Sequence Analysis, DNA (methods)</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr"><term>ADN (génétique)</term>
<term>Analyse de séquence d'ADN ()</term>
<term>Génomique ()</term>
<term>Humains</term>
<term>Séquençage nucléotidique à haut débit ()</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="genetics" xml:lang="en"><term>DNA</term>
</keywords>
<keywords scheme="MESH" qualifier="génétique" xml:lang="fr"><term>ADN</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en"><term>Genomics</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Humans</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr"><term>Analyse de séquence d'ADN</term>
<term>Génomique</term>
<term>Humains</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><sec><title>Background</title>
<p id="Par1">Single-cell sequencing experiments use short DNA barcode ‘tags’ to identify reads that originate from the same cell. In order to recover single-cell information from such experiments, reads must be grouped based on their barcode tag, a crucial processing step that precedes other computations. However, this step can be difficult due to high rates of mismatch and deletion errors that can afflict barcodes.</p>
</sec>
<sec><title>Results</title>
<p id="Par2">Here we present an approach to identify and error-correct barcodes by traversing the de Bruijn graph of circularized barcode k-mers. Our approach is based on the observation that circularizing a barcode sequence can yield error-free k-mers even when the size of <italic>k</italic>
is large relative to the length of the barcode sequence, a regime which is typical of single-cell barcoding applications. This allows for assignment of reads to consensus fingerprints constructed from k-mers.</p>
</sec>
<sec><title>Conclusion</title>
<p id="Par3">We show that for single-cell RNA-Seq, circularization improves the recovery of accurate single-cell transcriptome estimates, especially when there is a high number of errors per read. This approach is robust to the type of error (mismatch, insertion, deletion), as well as to the relative abundances of the cells. Sircel, a software package that implements this approach, is described and publicly available.</p>
</sec>
<sec><title>Electronic supplementary material</title>
<p>The online version of this article (10.1186/s12859-019-2612-0) contains supplementary material, which is available to authorized users.</p>
</sec>
</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Californie</li>
</region>
</list>
<tree><country name="États-Unis"><region name="Californie"><name sortKey="Tambe, Akshay" sort="Tambe, Akshay" uniqKey="Tambe A" first="Akshay" last="Tambe">Akshay Tambe</name>
</region>
<name sortKey="Pachter, Lior" sort="Pachter, Lior" uniqKey="Pachter L" first="Lior" last="Pachter">Lior Pachter</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000683 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000683 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Main |étape= Exploration |type= RBID |clé= PMC:6337828 |texte= Barcode identification for single cell genomics }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i -Sk "pubmed:30654736" \
   | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \
   | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.